February 1, 2008 



Relation of the Bell inequalities with 
quantum logic, hidden variables and 
information theory 

Emilio Santos 

Departamento de Fisica. Universidad de Cantabria. Santander. Spain 
Abstract 

I review the relation of the Bell inequalities - characteristic of (classical) 
probabilities defined on Boolean logics - with noncontextual and local hidden 
variables theories of quantum mechanics and with quantum information. 

Quantum mechanics looks radically different from all classical theories
of physics. Is that difference fundamental, or is it due to our present lack
of understanding of quantum mechanics? (That our understanding is not
good enough is widely recognized.) Several approaches have been followed
in the attempt to answer the question. We comment here briefly on three
of them, namely quantum logic, hidden variables and information theory,
showing the relation of each of these with the Bell inequalities.

I. Quantum logic and quantum probability 

According to Birkhoff and von Neumann, the difference between quantum
and classical theories is radical because it appears at the most fundamental
level, the logic. The elements of a logic are the propositions which, in
the language of physics, are observables having the possible values 1 (the
proposition is true) or 0 (false). Some pairs of propositions are related by
implication (A implies B if B is true whenever A is true). This binary
relation endows the logic with the mathematical structure of a partially
ordered set ("poset"). Another binary relation associates every proposition
with its negation (for each proposition A there exists another one, A', which is
true if and only if the first is false). This makes the poset orthocomplemented.
The internal operations "meet" and "join" endow the poset with a richer
structure, making it an orthocomplemented lattice. Finally it is assumed that
there exist the sure proposition, $I$, always true, and the absurd proposition,
$\emptyset$, always false, which makes the lattice complete. From now on any
complete and orthocomplemented lattice will be called a logic. Classical logic
is a distributive lattice, called a Boolean algebra.

In the view of Birkhoff and von Neumann the structure of quantum logic
may be derived from the correspondence between propositions and projection
operators (which we shall call projectors in the following). Accordingly these
authors postulated that the proposition associated to the projector $\hat P$ is
true (or false), for a physical system in a given state, if the state vector
$|\psi\rangle$ is an eigenvector of $\hat P$ (or of $\hat I - \hat P$) with
eigenvalue 1. This assumption gives rise to a trivalent logic where propositions
may be, in addition to true or false, also undefined (which happens if
$|\psi\rangle$ is an eigenvector with eigenvalue 1 of neither $\hat P$ nor
$\hat I - \hat P$). As projectors are associated to closed subspaces of the
Hilbert space, quantum logic has the mathematical structure of the set of
closed subspaces.

From these assumptions it is straightforward to define the fundamental
relation of order (or implication) of propositions. We say that, for two
propositions A and B, we have $A \le B$ (or $A \Rightarrow B$) if the subspace
associated to B contains that associated to A. Hence the binary operations
"meet", $\wedge$, and "join", $\vee$, may be defined in a natural form (the
intersection and the closed linear span of the two subspaces, respectively) and
it follows that the propositions form a lattice. The lattice is
orthocomplemented (the subspaces associated to the proposition A and its
negation A' are orthogonal) and complete (there exist the sure proposition,
$I$, corresponding to the whole Hilbert space, and its negation, $\emptyset$,
corresponding to the null vector). Up to here everything is similar to what
happens in classical logic. But the quantum lattice, unlike the classical one,
is not distributive (Boolean). As a conclusion the authors claimed that the
non-Boolean character of the lattice of propositions is the essential
characteristic of quantum theory. The details may be seen in the original
article.
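
The failure of distributivity can be seen already in a two-dimensional real space. The following is only an illustrative sketch, not taken from the original article: the helper names (`range_projector`, `join`, `meet`, `line`) and the choice of three lines in the plane are assumptions made for the example. Subspaces are represented by their orthogonal projectors, the join is the projector onto the span, and the meet is obtained from the join of the orthocomplements.

```python
import numpy as np

def range_projector(m, tol=1e-10):
    """Orthogonal projector onto the column space of a symmetric matrix m."""
    eigvals, eigvecs = np.linalg.eigh(m)
    cols = eigvecs[:, np.abs(eigvals) > tol]
    return cols @ cols.T

def join(p, q):
    """Projector onto the span of the two subspaces (A v B)."""
    return range_projector(p + q)

def meet(p, q):
    """Projector onto the intersection, computed as (A' v B')'."""
    identity = np.eye(p.shape[0])
    return identity - join(identity - p, identity - q)

def line(v):
    """Projector onto the one-dimensional subspace spanned by v."""
    v = np.asarray(v, dtype=float)
    return np.outer(v, v) / (v @ v)

# Three distinct lines through the origin of the plane (hypothetical choice).
A, B, C = line([1, 0]), line([0, 1]), line([1, 1])

lhs = meet(A, join(B, C))            # A ^ (B v C): B v C is the whole plane
rhs = join(meet(A, B), meet(A, C))   # (A ^ B) v (A ^ C): both meets are null

distributive = np.allclose(lhs, rhs)
```

Here `lhs` equals the projector A while `rhs` is the null projector, so the distributive law fails for this triple of propositions.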

In the 66 years elapsed since the work of Birkhoff and von Neumann
many articles and several books have been devoted to the subject of quantum
logic (see e.g. the book of Hooker), in many cases starting from different
definitions of quantum propositions. Some criticisms have also arisen, in
the sense that "quantum logic" is not a true logic, but just a propositional
calculus. Indeed in an "actual" logic the relations among propositions, like
$A \Rightarrow B$ or $A \wedge B = C$, should also be considered propositions,
which is not necessarily the case in a propositional calculus. But the approach
to the logic of quantum mechanics described above is still widely used.

In any logic (orthocomplemented and complete lattice) it is straightfor- 
ward to define a probability distribution (or "state"): 

Definition 1 If $\mathcal{L}$ is a logic, a probability distribution is a
mapping $p: \mathcal{L} \to [0,1]$ with the axioms

1) $p(\emptyset) = 0$, $p(I) = 1$, where $\emptyset$ ($I$) is the absurd (sure) proposition,

2) If $\{A_i\}$ is a sequence such that $A_i \le A_j'$, $A'$ being the negation of $A$,
for all pairs $i \ne j$, then $\sum_i p(A_i) = p(\bigvee_i A_i)$,

3) For any sequence $\{A_i\}$, $p(A_i) = 1 \ \forall i \Rightarrow p(\bigwedge_i A_i) = 1$.

Thus from quantum logic, as defined by Birkhoff and von Neumann, we
get quantum probability, whilst the classical, Boolean, logic provides the
standard probability theory. Indeed the above axioms are simply a
generalization of the axioms of probability as stated by Kolmogorov.

II. The Bell inequalities



A discrimination between classical and quantum probability is provided
by the Bell inequalities, derived as follows. For any two propositions
$A, B \in \mathcal{L}$ we may define a function, $d(A,B)$, by

$$d(A,B) = p(A \vee B) - p(A \wedge B). \tag{1}$$

That function has the properties 

$$0 \le d(A,B) \le 1, \quad d(A,A) = 0, \quad d(A,A') = 1, \tag{2}$$

and provides some measure of the "distance" between two propositions in a 
given state (probability distribution). The function is called a metric (pseu- 
dometric) if the following additional property holds (does not hold) true 

$$d(A,B) = 0 \Rightarrow A = B, \tag{3}$$

but this property is not very relevant for our purposes. More important are 
the following triangle inequalities, which are (are not generally) fulfilled if the 
lattice is (is not) Boolean 

$$|d(A,B) - d(A,C)| \le d(B,C) \le d(A,B) + d(A,C). \tag{4}$$

As the Boolean character provides the essential difference between classical
and quantum theories, according to Birkhoff and von Neumann, we see
that the triangle inequalities (4) give a criterion to distinguish both theories.
These inequalities are closely related to the Bell inequalities, as shown in the
following, although in the mathematical theory of probability the inequalities
(4) were known well before Bell's work.
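
For a Boolean logic the triangle inequalities can be checked by brute force. The sketch below is an illustration under assumptions not in the original: a hypothetical four-point sample space with arbitrary weights, propositions represented as subsets, and the distance taken as d(A,B) = p(A or B) - p(A and B), which for sets is the probability of the symmetric difference.

```python
from itertools import combinations, product

# Hypothetical classical probability space: four outcomes with fixed weights.
weights = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def prob(event):
    """Probability of an event (a set of outcomes)."""
    return sum(weights[x] for x in event)

def distance(a, b):
    """d(A,B) = p(A or B) - p(A and B) for classical events."""
    return prob(a | b) - prob(a & b)

# All 16 propositions of the Boolean algebra of subsets.
events = [frozenset(c)
          for r in range(len(weights) + 1)
          for c in combinations(weights, r)]

def triangle_holds(a, b, c, tol=1e-12):
    """Both triangle inequalities for the triple (A, B, C)."""
    d_ab, d_ac, d_bc = distance(a, b), distance(a, c), distance(b, c)
    return abs(d_ab - d_ac) <= d_bc + tol and d_bc <= d_ab + d_ac + tol

all_hold = all(triangle_holds(a, b, c)
               for a, b, c in product(events, repeat=3))
```

Every one of the 4096 triples satisfies both inequalities, as expected for a Boolean lattice.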

In quantum mechanics, if we consider three compatible propositions, {A,
B, C} (associated to pairwise commuting projectors), the inequalities (4)
hold true because the lattice of commuting projectors is distributive. On the
other hand, if two of the propositions, say A and B, are not compatible then
their distance is not defined, because quantum mechanics does not provide
a joint probability of two incompatible observables (and it is assumed that
they cannot be measured simultaneously). However there are quadrilateral
inequalities, derived from the triangular ones (4), which may be violated
by quantum mechanics and tested empirically. In fact, if we consider four
projectors {A, B, C, D}, applying the inequalities (4) first to the triple
{A, C, D} and then to the triple {A, B, C} leads to

$$d(A,D) \le d(A,B) + d(B,C) + d(C,D). \tag{5}$$

Unlike the inequalities (4), now all four distances are defined
in quantum mechanics if every pair involves commuting projectors (that is if
[A,D] = [A,B] = [B,C] = [C,D] = 0). We see that the inequality (5) and
the other three obtained by permutations involving the four projectors are
necessary conditions for the existence of a classical joint probability
distribution defined on the set of projectors. There are cases where quantum
mechanics predicts violations of one of the inequalities, which leads to Bell's
theorem (see sections IV and V).

Inequality (5) is equivalent to the following one

$$p_B + p_C \ge p_{AB} + p_{BC} + p_{CD} - p_{AD}, \tag{6}$$

where $p_A$ (or $p_{AB}$) is the probability that A (or both A and B) is true.
This is called a Bell inequality and, in this form, it was derived by Clauser
and Horne. Instead of projectors, taking the values 0 or 1, we might use
observables taking the values -1 or +1. They are trivially related to the
projectors by

$$a = 2A - 1, \quad b = 2B - 1, \quad \text{etc.} \tag{7}$$

and the inequality (6) takes the form of Clauser, Horne, Shimony and Holt (CHSH)

$$|\langle ab\rangle + \langle bc\rangle + \langle cd\rangle - \langle ad\rangle| \le 2, \tag{8}$$

where $\langle ab\rangle$ means the expectation value of the product of a and b.
Therefore the CHSH inequalities (8) and the Clauser-Horne inequalities (6) are
equivalent.
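
The chain from the quadrilateral inequality to the Clauser-Horne and CHSH forms can be written out in a few lines. This is a sketch of the algebra, assuming the usual identity $p(A \vee B) = p_A + p_B - p_{AB}$ for compatible propositions; it gives the upper bound in the CHSH form, the lower bound following in the same way from one of the permuted inequalities.

```latex
\begin{align*}
d(A,B) &= p(A\vee B) - p(A\wedge B) = p_A + p_B - 2p_{AB},\\[4pt]
d(A,D) \le d(A,B)+d(B,C)+d(C,D)
 &\iff p_A + p_D - 2p_{AD} \le p_A + 2p_B + 2p_C + p_D
      - 2\,(p_{AB}+p_{BC}+p_{CD})\\
 &\iff p_B + p_C \ge p_{AB}+p_{BC}+p_{CD}-p_{AD},\\[4pt]
\langle ab\rangle &= \langle (2A-1)(2B-1)\rangle = 4p_{AB} - 2p_A - 2p_B + 1,\\[4pt]
\langle ab\rangle+\langle bc\rangle+\langle cd\rangle-\langle ad\rangle
 &= 2 - 4\,\bigl[\,p_B+p_C-(p_{AB}+p_{BC}+p_{CD}-p_{AD})\,\bigr] \le 2 .
\end{align*}
```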



III. Hidden variables theories 

The question of hidden variables in quantum mechanics arose soon after
the formulation of the theory during the years 1925-26. It was explicitly
mentioned in the book by von Neumann in 1932, where he derived a celebrated
no-hidden-variables theorem. Since then many books and articles
have been devoted to the subject. Nevertheless there is no sharp definition
of hidden variables (HV) theory which is widely accepted. I propose the
following:

Definition 2 HV is a theory physically equivalent to quantum mechanics 
(that is giving the same predictions for all experiments) which has the formal 
structure of classical statistical mechanics. 

The definition may be illustrated in the following table giving the
correspondence of concepts in experiments, standard quantum theory and a
possible HV theory:

Table I. Correspondence of concepts

EMPIRICAL           QUANTUM THEORY                                   HV THEORY

physical system     Hilbert space $H$                                phase space $\Lambda$
state               vector $|\psi\rangle \in H$                      probability density $\rho(\lambda)$
observable $A$      self-adjoint operator $\hat A$                   function $A(\lambda)$
expectation value   $\langle\psi|\hat A|\psi\rangle$                 $= \int A(\lambda)\rho(\lambda)\,d\lambda$
correlation         if $\hat A\hat B = \hat B\hat A$,                $= \int A(\lambda)B(\lambda)\rho(\lambda)\,d\lambda$
                    $\langle\psi|\hat A\hat B|\psi\rangle$


The parameter (or parameters) $\lambda$ is usually called the hidden variable.
Two observables, A and B, which are associated to commuting operators,
$\hat A$ and $\hat B$, are said to be compatible. The correlation may be
extended to more than two compatible observables. It is easy to see that the
latter equality implies the equality of the joint probability distributions of
compatible observables.

In fact, it is enough to substitute $\exp(i\xi\hat A)$ for $\hat A$ and
$\exp[i\xi A(\lambda)]$ for $A(\lambda)$ in the equality, and similarly for B,
in order to show the equality of the characteristic functions of the joint
probability distribution. On the other hand, it is well known that quantum
mechanics does not provide joint distributions of incompatible observables
(those whose associated operators do not commute). For the sake of clarity, in
the Table we have considered only quantum pure states. The most general states
are associated to density operators, $\hat\rho$, whence the quantum expectation
value and correlation should be written, respectively,

$$\mathrm{Tr}(\hat\rho\hat A), \qquad \mathrm{Tr}(\hat\rho\hat A\hat B).$$

In order to make clear what is the content of the theorems against HV 
theories, discussed later, I propose the following 

Definition 3 A simple experiment consists of the preparation of a state of 
a physical system, followed by the evolution of the system and finishing with 
the measurement of a set of compatible observables. 

Definition 4 A composite experiment consists of several simple experiments 
with the same preparation and the same subsequent evolution, but measuring 
different sets of compatible observables in each simple experiment. 

With these definitions we may state the following theorem:



Theorem 5 For any simple experiment there exists a HV theory. 



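Proof: The essential part of the proof is to show that for any state
$|\psi\rangle$ and two compatible observables A, B the expectation may be
obtained in the form

$$\langle\psi|\hat A\hat B|\psi\rangle = \int A(\lambda)B(\lambda)\rho(\lambda)\,d\lambda. \tag{9}$$

For simplicity we consider just two observables, but the generalization to
any finite number is trivial. In order to proceed with the proof we recall that
there exists a complete set of orthonormal vectors which are simultaneous
eigenvectors of two commuting self-adjoint operators. Let us label
$|\lambda\rangle$ one of the common eigenvectors of $\hat A$ and $\hat B$.
Complete means that

$$\int |\lambda\rangle\langle\lambda|\,d\lambda = 1, \tag{10}$$

which leads to

$$\langle\psi|\hat A\hat B|\psi\rangle = \int d\lambda\,d\lambda'\,d\lambda''\,\langle\psi|\lambda\rangle\langle\lambda|\hat A|\lambda'\rangle\langle\lambda'|\hat B|\lambda''\rangle\langle\lambda''|\psi\rangle$$

$$= \int d\lambda\,\langle\psi|\lambda\rangle\langle\lambda|\hat A|\lambda\rangle\langle\lambda|\hat B|\lambda\rangle\langle\lambda|\psi\rangle \tag{11}$$

$$= \int d\lambda\,|\langle\psi|\lambda\rangle|^2\,\langle\lambda|\hat A|\lambda\rangle\langle\lambda|\hat B|\lambda\rangle. \tag{12}$$

This has the structure of the right side of eq.(9) provided we identify
$\langle\lambda|\hat A|\lambda\rangle$ with the function $A(\lambda)$ and
$|\langle\psi|\lambda\rangle|^2$ with the density $\rho(\lambda)$. Indeed
the density is positive and normalized (the latter because of eq.(10)). Eq.(11)
follows from the equality

$$\langle\lambda|\hat A|\lambda'\rangle = \langle\lambda|\hat A|\lambda\rangle\,\delta(\lambda - \lambda'), \tag{13}$$

$\delta$ being Dirac's delta, which is a consequence of $|\lambda\rangle$ and
$|\lambda'\rangle$ being eigenvectors of $\hat A$.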
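
The construction in the proof can be tried numerically in a finite-dimensional analogue, where the integral over the hidden variable becomes a sum over the labels of the common eigenbasis. The following is a sketch under assumptions that are not in the original: a hypothetical four-dimensional space, a randomly chosen common eigenbasis, and arbitrary eigenvalues for the two commuting observables.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Hypothetical common eigenbasis |lambda> (columns of a random unitary matrix).
basis, _ = np.linalg.qr(rng.normal(size=(dim, dim))
                        + 1j * rng.normal(size=(dim, dim)))

# Hidden-variable functions A(lambda), B(lambda): the eigenvalues.
a_values = np.array([1.0, 1.0, -1.0, -1.0])
b_values = np.array([1.0, -1.0, 1.0, -1.0])

# Commuting self-adjoint operators sharing that eigenbasis.
A = basis @ np.diag(a_values) @ basis.conj().T
B = basis @ np.diag(b_values) @ basis.conj().T

# Arbitrary normalized state vector |psi>.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

# Density rho(lambda) = |<lambda|psi>|^2, positive and normalized.
rho = np.abs(basis.conj().T @ psi) ** 2

quantum = np.vdot(psi, A @ B @ psi).real            # <psi|AB|psi>
hidden = float(np.sum(a_values * b_values * rho))   # sum of A*B*rho over lambda
```

The two numbers `quantum` and `hidden` agree to machine precision, which is the discrete counterpart of eq. (9) for a single simple experiment.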

We see that hidden variables are always possible, a fact made clear by J.
S. Bell in 1966. However, some families of HV theories are excluded, for
instance those in which expectations fulfil linear relations of the form

$$\langle\psi|\hat A + \hat B|\psi\rangle = \int [A(\lambda) + B(\lambda)]\rho(\lambda)\,d\lambda \tag{14}$$

even when $\hat A$ and $\hat B$ do not commute. The impossibility of such HV
theories is the content of von Neumann's theorem mentioned above. Assumption
(14) is unphysical, as pointed out by Bell, which shows that von Neumann's
theorem is not very relevant. More physical requirements are noncontextuality
and locality, which we discuss in the following.



IV. Noncontextual hidden variables 



Definition 6 A HV theory is noncontextual if there exists a joint probability 
distribution for all observables of the system (even if some of them are not 
compatible.) 



In particular this implies that the marginal for the variable A in the
joint distribution of the compatible observables A and B is the same as the
marginal for A in the joint distribution of the compatible observables A and
C, even if B and C are not compatible. For this reason noncontextuality
is sometimes stated by saying that the result of measuring A does not depend
on the context (in particular, the result is the same whether we measure A
together with B or we measure A together with C; remember that A, B and C
cannot be measured simultaneously, that is with the same experimental
setup). The latter property is true in quantum mechanics, but the existence of
a joint distribution is a stronger constraint. What is required is the existence
of some function of all the observables, p(A, B, C, ...), which fulfils the
mathematical properties of a joint probability distribution and is such that the
marginals for every subset of compatible observables are the same as those given
by quantum mechanics. The said distribution is just a mathematical object (it
cannot be measured if some of the observables are not compatible) but its
mere existence puts constraints which may be tested empirically.

It is not difficult to see that the existence of a joint distribution for the
observables A, B, C, ... is equivalent to the existence of a positive normalized
function, $\rho(\lambda)$, of a variable or set of variables, $\lambda$, and
functions $A(\lambda)$, $B(\lambda)$, $C(\lambda)$, ... However a joint
probability distribution cannot be obtained with the construction of eq.(12) if
the observables are not compatible. This is because a complete orthonormal set
of simultaneous eigenvectors of $\hat A$, $\hat B$, $\hat C$, ... may not exist
if the operators do not commute pairwise. What may be obtained are several HV
theories, one for each simple experiment. For instance, let us consider a
composite experiment consisting of two simple ones. In the first, where we
measure A and B, a HV theory should provide the functions $\rho_1(\lambda)$,
$A_1(\lambda)$, $B_1(\lambda)$. In the second, where we measure A and C, a HV
theory would give $\rho_2(\lambda)$, $A_2(\lambda)$, $C_2(\lambda)$. The two HV
theories together might be called a HV theory for the composite experiment. It
would be noncontextual if $\rho_1(\lambda) = \rho_2(\lambda)$ and
$A_1(\lambda) = A_2(\lambda)$; otherwise it would be contextual.

The impossibility of noncontextual theories is established by the following

Theorem 7 Noncontextual HV theories do not exist for all (composite)
experiments.

This is usually called the Kochen-Specker theorem, after the authors who
proved it in 1967. However the theorem had actually been proved one year
earlier by Bell, and it is a rather direct consequence of a theorem proved
in 1957 by Gleason. We shall give here a proof inspired by the celebrated
theorem of Bell against local hidden variables.

Proof: It is enough to exhibit a particular type of composite experiment
where the quantum predictions are incompatible with the existence of a joint
probability distribution for all observables. We consider four dichotomic
observables, A, B, C and D, each of which may take the values 0 or 1. We
assume that A and C are not compatible, and B and D are also not compatible,
the remaining pairs being compatible. The corresponding operators will
be projectors, i.e. $\hat A^2 = \hat A$, etc., all pairs commuting except

$$[\hat A, \hat C] \ne 0, \qquad [\hat B, \hat D] \ne 0. \tag{15}$$

Let us label $p_A$ the probability of A = 1, $p_{AB}$ the probability that
A = B = 1, etc. The existence of a joint distribution means that there are 15
positive quantities

$$p_A, p_B, p_C, p_D, p_{AB}, p_{AC}, p_{AD}, p_{BC}, p_{BD}, p_{CD}, p_{ABC}, p_{ABD}, p_{ACD}, p_{BCD}, p_{ABCD}, \tag{16}$$

which should fulfil the relations

$$0 \le p_{ABCD} \le p_{ABC} \le p_{AB} \le p_A \le 1, \tag{17}$$

and those obtained by all permutations of the labels. Only 8 of these
quantities may be measured (and they are predicted by quantum mechanics),
namely

$$p_A, p_B, p_C, p_D, p_{AB}, p_{AD}, p_{BC}, p_{CD}. \tag{18}$$

The remaining 7 quantities cannot be measured, the corresponding
observables not being compatible, and quantum mechanics gives no value for them.

The question is whether there exist 7 quantities fulfilling all constraints of
the type (17) which, added to the 8 measurable ones, provide the desired joint
probability distribution (16). Now a necessary condition for the existence
of a joint probability distribution is the fulfilment of the Bell inequalities
discussed above. A sufficient condition is the fulfilment of the 4 Bell
inequalities obtained by suitable permutation of labels in (6) or (8) (that is,
substituting A for C or D for B or both). The rest of the proof consists of
showing that there are states and observables for which quantum mechanics
violates the inequalities, which may be seen elsewhere.
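
A standard example of such a violation can be computed directly. The sketch below is illustrative and makes assumptions not in the original: two spin-1/2 particles in the singlet state, observables with values +1 or -1 given by spin components in the x-z plane, a and c acting on the first particle and b and d on the second, and one particular choice of angles.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Spin observable (eigenvalues +1, -1) at angle theta in the x-z plane."""
    return np.cos(theta) * SZ + np.sin(theta) * SX

# Singlet state (|01> - |10>) / sqrt(2).
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)

def corr(theta1, theta2):
    """Expectation of the product of a spin on particle 1 and one on particle 2."""
    op = np.kron(spin(theta1), spin(theta2))
    return np.vdot(singlet, op @ singlet).real

# Hypothetical measurement angles: a, c on particle 1; b, d on particle 2.
a, b, c, d = 0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4

# <ab> + <bc> + <cd> - <ad>, with each pair ordered (particle 1, particle 2).
chsh = corr(a, b) + corr(c, b) + corr(c, d) - corr(a, d)
```

With these angles the absolute value of `chsh` is 2*sqrt(2), above the bound 2 that any joint probability distribution would impose.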

V. Local hidden variables 

An important class of HV theories are local HV theories. The concept of
locality applies to EPR experiments. We call EPR an experiment where we
prepare locally a system which is later divided into two subsystems, each of
which moves in a different direction. Measurements on each subsystem are
later made at space-like separation (in the sense of relativity theory).

Definition 8 A HV theory is local if, for any EPR experiment where we may
measure one of several observables, $A_i$, on the first subsystem and one of
several observables, $B_j$, on the second, there exists a joint probability
distribution for all the observables $\{A_i, B_j;\ i, j = 1, 2, \ldots\}$.

The impossibility of local HV theories is established by the celebrated
Bell's theorem of 1964:

Theorem 9 Local HV theories do not exist for all (EPR) experiments. 

Proof: The proof is the same as for noncontextual HV theories, but
considering an EPR experiment. That is, the observables A, C belong to one
subsystem and B, D to the other subsystem. In particular, this guarantees
that the pairs {A, B}, {A, D}, {C, B}, {C, D} are compatible because they
belong to spacelike separated regions (the condition that spacelike separated
observables are compatible is called microcausality in quantum field theory).

The class of local theories is wider than that of noncontextual HV theories
because the constraints in their definition are weaker. Indeed in local theories
the existence of a joint distribution is only required for EPR experiments,
but noncontextual theories assume it for all experiments. Consequently the
empirical disproof is easier for noncontextual theories than for local theories.
For the former it is enough to perform a composite experiment where the
measurements are made locally; the latter requires measurements at spacelike
separation.

The fact that the proofs of both theorems are very similar has been a
source of misunderstanding, like the assertion that locality is not needed in
order to prove Bell's theorem. I hope that in our presentation the point is
clearer. But in order to stress the distinction between noncontextual and
local I give an illustrative example.

Let us assume that we want to perform a test of Bell's inequality using 
two spin-1/2 particles prepared in a singlet (zero total spin) state. An exam- 
ple could be the dissociation of a molecule consisting of two sodium atoms. 
We should measure the spin components along two directions for each atom
(in four different simple experiments; see section III for the definition of
simple experiment). These directions define the four projectors involved in the Bell
inequality. If the inequality is violated we would have an empirical disproof 
of noncontextual hidden variables theories. However if we want to test lo- 
cal theories, the measurements should be performed at space-like separation, 
which is a rather strong requirement. 

For instance, we might use two Stern-Gerlach apparatuses each of length
L. If the atoms move at velocity v, in opposite directions, the duration of
the measurement would be L/v. The condition that the measurements are
space-like separated means that the distance, d, between the Stern-Gerlach
apparatuses should fulfil

$$d > 2L\,\frac{c}{v}.$$

This inequality involves the velocity of light, c, as it should, locality (in
the sense of Bell) being a relativistic concept. An estimate of the minimal
distance is obtained if we use a typical energy involved in dissociation, say
1 eV, and L is of the order of a few centimeters. We get for the minimal
distance several kilometers. Thus the empirical disproof of local hidden
variables is far more difficult than that of noncontextual ones.
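
The order of magnitude quoted above can be reproduced with a short calculation. This is a rough sketch under assumptions that go beyond the text: the condition is taken as d > 2Lc/v, each sodium atom (mass about 22.99 atomic mass units) is assumed to carry a kinetic energy of 1 eV treated non-relativistically, and L is set to 3 cm.

```python
import math

EV = 1.602176634e-19       # joules per electronvolt
AMU = 1.66053906660e-27    # kilograms per atomic mass unit
C = 299_792_458.0          # speed of light in m/s

def minimal_distance(length_m, kinetic_energy_ev, mass_amu):
    """Lower bound 2*L*c/v on the separation of the two apparatuses, in meters."""
    # Non-relativistic speed from the assumed kinetic energy: E = m v^2 / 2.
    speed = math.sqrt(2 * kinetic_energy_ev * EV / (mass_amu * AMU))
    return 2 * length_m * C / speed

# Assumed values: L = 3 cm, 1 eV per atom, sodium atom mass.
d_min = minimal_distance(length_m=0.03, kinetic_energy_ev=1.0, mass_amu=22.99)
```

With these assumed inputs the atomic speed is about 3 km/s and `d_min` comes out at roughly 6 km, consistent with the estimate of several kilometers.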

VI. Quantum information



The amount of information is quantified with the concept of entropy. In
classical physics, if we have a continuous random variable, $\lambda$, with a
probability distribution $\rho(\lambda)$, the entropy, $S^C$, as defined by
Shannon is

$$S^C = -\int \rho(\lambda)\log\rho(\lambda)\,d\lambda. \tag{19}$$

The quantum entropy was defined by von Neumann in terms of the density
operator, $\hat\rho$, with an expression which looks similar to that one, namely

$$S^Q = -\mathrm{Tr}(\hat\rho\log\hat\rho). \tag{20}$$

In both cases $S \ge 0$ and the entropy increases with the lack of information,
so that the pure states (maximal information) correspond to S = 0.

There are two other properties which hold true for both classical and
quantum entropy:

Concavity: $\lambda S(\rho_a) + (1-\lambda) S(\rho_b) \le S(\lambda\rho_a + (1-\lambda)\rho_b)$, $0 \le \lambda \le 1$,

where $\rho_a$ stands for either the classical probability density,
$\rho_a(\lambda)$, or the quantum density operator, $\hat\rho_a$, and similarly
$\rho_b$ for a different probability density or density operator of the same
system.

Subadditivity: $S(\rho_{12}) \le S(\rho_1) + S(\rho_2)$,

where $\rho_{12}$ stands for either the classical probability density,
$\rho_{12}(\lambda_1, \lambda_2)$, or the quantum density operator,
$\hat\rho_{12}$, the subindex 1 (2) referring to the first (second) subsystem
of a composite system, and we have

$$\rho_1(\lambda_1) = \int \rho_{12}(\lambda_1,\lambda_2)\,d\lambda_2, \qquad \hat\rho_1 = \mathrm{Tr}_2\,\hat\rho_{12}. \tag{21}$$

There is, however, a property which dramatically distinguishes classical
from quantum entropy. In fact, in the case of a system consisting of two
subsystems, the classical (Shannon) entropy fulfils

$$S^C(\rho_{12}) \ge \max\{S^C(\rho_1), S^C(\rho_2)\}, \tag{22}$$

whilst the quantum entropy fulfils the weaker triangle inequality

$$S^Q(\hat\rho_{12}) \ge |S^Q(\hat\rho_1) - S^Q(\hat\rho_2)|. \tag{23}$$

In my opinion, the fact that the quantum entropy does not fulfil an
inequality similar to (22) is highly paradoxical, I would even say bizarre. In
fact, (23) allows for the possibility that both $S^Q(\hat\rho_1)$ and
$S^Q(\hat\rho_2)$ are positive whilst $S^Q(\hat\rho_{12})$ is zero. This should
be interpreted as saying that we have complete information about a composite
system whilst we have incomplete information about every subsystem. This
contrasts with the classical, and intuitive, idea that full information about
the whole means that we have complete information about every part. In my view
this indicates that the concept of "complete" information in quantum theory is
not the same as in classical physics, and the different meanings of
completeness have been the source of misunderstandings about the interpretation
of quantum theory, e.g. in the debate between Einstein and Bohr.
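
The situation described above, zero entropy for the whole and positive entropy for each part, is realized by the singlet state of two spin-1/2 particles. The following sketch is an illustration, not part of the original argument; the helper names `von_neumann_entropy` and `partial_trace` are assumptions of the example, and natural logarithms are used.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S = -Tr(rho log rho), computed from the non-zero eigenvalues of rho."""
    eigvals = np.linalg.eigvalsh(rho)
    eigvals = eigvals[eigvals > tol]
    return float(-np.sum(eigvals * np.log(eigvals)))

def partial_trace(rho12, dims, keep):
    """Reduced density matrix of subsystem `keep` (0 or 1) of a bipartite state."""
    d1, d2 = dims
    r = rho12.reshape(d1, d2, d1, d2)
    if keep == 0:
        return np.trace(r, axis1=1, axis2=3)   # trace over the second subsystem
    return np.trace(r, axis1=0, axis2=2)       # trace over the first subsystem

# Singlet state (|01> - |10>) / sqrt(2): a pure state of the composite system.
singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho12 = np.outer(singlet, singlet.conj())

rho1 = partial_trace(rho12, (2, 2), keep=0)
rho2 = partial_trace(rho12, (2, 2), keep=1)

s12 = von_neumann_entropy(rho12)
s1 = von_neumann_entropy(rho1)
s2 = von_neumann_entropy(rho2)
```

Here `s12` vanishes while `s1` and `s2` both equal log 2, so the triangle inequality is satisfied (as an equality) but an inequality of the classical form, entropy of the whole at least as large as that of each part, is not.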

The violation of an inequality similar to (22) is closely related to the
violation of the Bell inequality. But in order to establish the connection it
is necessary to introduce the concept of linear entropy. Actually, although
the definitions of entropy (19) and (20) are standard and in some sense
optimal, it is possible to give alternative definitions of entropy which fulfil
the essential properties of concavity and subadditivity. The simplest are
the so-called linear entropies

$$S^{CL} = 1 - \int \rho(\lambda)^2\,d\lambda, \qquad S^{QL} = 1 - \mathrm{Tr}(\hat\rho^2). \tag{24}$$

The desired connection between linear entropy and the Bell inequalities
has been studied by several authors in the last few years. For instance
Horodecki et al. proved that the inequality (22), written for the quantum
linear entropy, is a sufficient condition for the Bell inequalities. A slightly
stronger result may be stated as follows

Theorem 10 The inequality

$$MN\,S^{QL}(\hat\rho_{12}) - MN + M + N \ge N\,S^{QL}(\hat\rho_1) + M\,S^{QL}(\hat\rho_2), \tag{25}$$

where M and N are the dimensions of the Hilbert spaces of the two subsystems,
is a sufficient condition for all Bell inequalities (6) or (8) which may be got
using two dichotomic observables of each subsystem.

Proof: We consider observables {a, c} for the first particle and {b, d} for
the second, all of which may take the values 1 or -1, and the associated
operators, $\hat a$, $\hat b$, $\hat c$ and $\hat d$. We define the Bell
operator, $\hat B$, by

$$\hat B = \hat a\otimes\hat b + \hat c\otimes\hat b + \hat c\otimes\hat d - \hat a\otimes\hat d. \tag{26}$$
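
It is easy to see that

$$\mathrm{Tr}\,\hat B = 0, \qquad \mathrm{Tr}(\hat B^2) = 4MN, \tag{27}$$

and that the Bell inequality (8) is violated if

$$|\beta| > 2, \qquad \beta = \mathrm{Tr}(\hat B\hat\rho_{12}) \tag{28}$$

(whilst quantum mechanics predicts just $|\beta| \le 2\sqrt{2}$). Now the obvious
inequality

$$\mathrm{Tr}\left[\left(\hat\rho_{12} - \frac{1}{M}\,\hat\rho_1\otimes\hat I_2 - \frac{1}{N}\,\hat I_1\otimes\hat\rho_2 + \frac{1}{MN}\,\hat I_1\otimes\hat I_2 + \lambda\hat B\right)^2\right] \ge 0, \quad \lambda \in R, \tag{29}$$

where $\hat I_1$ ($\hat I_2$) is the identity operator for the first (second)
particle, whose Hilbert space has dimension N (M), gives a quadratic expression
in the variable $\lambda$. Choosing the value of $\lambda$ that minimizes it we
get, after some algebra

$$MN\,\mathrm{Tr}(\hat\rho_{12}^2) - N\,\mathrm{Tr}(\hat\rho_1^2) - M\,\mathrm{Tr}(\hat\rho_2^2) \ge \frac{1}{4}(\beta^2 - 4). \tag{30}$$

Hence the inequality (25) implies $|\beta| \le 2$, which proves the theorem.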

Actually the inequality (25) is rather strong, and therefore not very useful,
if either M > 2 or N > 2 or both, and it is trivial if either M = 1 or
N = 1. Consequently its main interest is the case M = N = 2, where it is
a consequence of the inequality (22) characteristic of classical information
theory.
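
For two qubits the bound used in the proof can be checked numerically. The sketch below is only an illustration under assumptions not in the original: the family of Werner states w|singlet><singlet| + (1 - w) I/4, spin observables in the x-z plane with a, c on the first qubit and b, d on the second, and one fixed set of angles. It evaluates the quantity MN Tr(rho12^2) - N Tr(rho1^2) - M Tr(rho2^2) and compares it with (beta^2 - 4)/4; for M = N = 2 the entropy condition of Theorem 10 amounts to that quantity being non-positive.

```python
import numpy as np

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def spin(theta):
    """Spin observable (eigenvalues +1, -1) at angle theta in the x-z plane."""
    return np.cos(theta) * SZ + np.sin(theta) * SX

def bell_operator(a, b, c, d):
    """a(x)b + c(x)b + c(x)d - a(x)d, with a, c on qubit 1 and b, d on qubit 2."""
    return (np.kron(spin(a), spin(b)) + np.kron(spin(c), spin(b))
            + np.kron(spin(c), spin(d)) - np.kron(spin(a), spin(d)))

def reduced_states(rho12):
    """Reduced density matrices of the first and second qubit."""
    r = rho12.reshape(2, 2, 2, 2)
    return np.trace(r, axis1=1, axis2=3), np.trace(r, axis1=0, axis2=2)

def purity(rho):
    """Tr(rho^2)."""
    return float(np.trace(rho @ rho).real)

def werner(w):
    """Mixture of the singlet (weight w) and the maximally mixed state."""
    singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
    return w * np.outer(singlet, singlet.conj()) + (1 - w) * np.eye(4) / 4

M = N = 2
B = bell_operator(0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)

rows = []  # (w, left side of the bound, beta)
for w in np.linspace(0.0, 1.0, 11):
    rho12 = werner(w)
    rho1, rho2 = reduced_states(rho12)
    lhs = M * N * purity(rho12) - N * purity(rho1) - M * purity(rho2)
    beta = float(np.trace(B @ rho12).real)
    rows.append((float(w), lhs, beta))
```

In every row the left side is at least (beta^2 - 4)/4, and whenever it is non-positive the value of beta stays within the classical bound of 2 in absolute value.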